Scalable Techniques for Computing Band Linear Recurrences on Massively Parallel and Vector Supercomputers
Abstract
In this paper, we present a new scalable algorithm, called the Regular Schedule, for parallel evaluation of band linear recurrences (BLR's, i.e., mth-order linear recurrences for m ≥ 1). Its scalability and simplicity make it well suited for vector supercomputers and massively parallel computers. We describe our implementation of the Regular Schedule on two types of machines, the Convex C240 and the MasPar MP-2, and demonstrate the scalability of our scheduling techniques on both. On the Convex C240, a range of programs containing BLR's implemented using the Regular Schedule in C show significant improvements in CPU performance over the same programs implemented using the highly optimized, assembly-coded BLAS routines [17]. On the MasPar MP-2, we demonstrate the scalability of this schedule up to two thousand processors. Our approach can be used both at the user level, in parallel programming of code containing BLR's, and in compiler parallelization of such programs, combined with recurrence recognition techniques, for massively parallel and vector supercomputers.
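To make the term concrete, here is a minimal Python sketch of an mth-order linear recurrence evaluated sequentially, followed by the standard observation that first-order steps compose associatively, which is what makes parallel evaluation possible in principle. This is not the Regular Schedule, which the abstract does not detail; the names `blr_sequential` and `compose` and the coefficient layout are assumptions made for illustration.

```python
from functools import reduce


def blr_sequential(a, b, x_init):
    """Evaluate x[i] = b[i] + sum_{j=1..m} a[i][j-1] * x[i-j] for i = m..n-1.

    a[i] holds the m coefficients of step i, b[i] its constant term, and
    x_init the m starting values (hypothetical layout; entries of a and b
    below index m are ignored).
    """
    m = len(x_init)
    x = list(x_init)
    for i in range(m, len(b)):
        # Loop-carried dependence: step i reads the m previous results.
        x.append(b[i] + sum(a[i][j - 1] * x[i - j] for j in range(1, m + 1)))
    return x


def compose(f, g):
    """Compose affine maps stored as (a, b), meaning x -> a*x + b.

    Returns the map 'apply f, then g'. The operation is associative, so a
    chain of first-order steps can be grouped in any order, e.g. as a tree.
    """
    return (g[0] * f[0], g[0] * f[1] + g[1])


# Second-order example (m = 2): Fibonacci numbers.
fib = blr_sequential([[1, 1]] * 10, [0] * 10, [0, 1])

# First-order example (m = 1): x[i] = a[i] * x[i-1] + b[i], with x[0] = 1.
steps = [(2, 1), (3, 0), (1, 4), (2, 5)]
seq = blr_sequential([[0]] + [[s[0]] for s in steps],
                     [0] + [s[1] for s in steps], [1])
left_to_right = reduce(compose, steps)
tree = compose(compose(steps[0], steps[1]), compose(steps[2], steps[3]))
```

Both groupings of `compose` yield the same affine map, and applying it to `x[0]` reproduces the last value of the sequential loop. A parallel schedule exploits this freedom to regroup; how the Regular Schedule does so is described in the paper itself.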
Related resources
Computing Programs Containing Band Linear Recurrences on Vector Supercomputers
Many large-scale scientific and engineering computations, e.g., some of the Grand Challenge problems [1], spend a major portion of execution time in their core loops computing band linear recurrences (BLR's). Conventional compiler parallelization techniques [4] cannot generate scalable parallel code for this type of computation because they respect loop-carried dependences (LCD's) in programs an...
Parallel Programming Models and Paradigms
In the 1980s it was believed computer performance was best improved by creating faster and more efficient processors. This idea was challenged by parallel processing, which in essence means linking together two or more computers to jointly solve a computational problem. Since the early 1990s there has been an increasing trend to move away from expensive and specialized proprietary parallel superc...
Atmospheric Data Assimilation on Distributed-Memory Parallel Supercomputers
Atmospheric data such as temperature, moisture, and winds, collected by satellites and by direct measurements from upper-air instruments and ground observation stations, provide only partial information about the atmosphere. They are assimilated into numerical forecasts to provide a coherent, evolving state of the global atmosphere. The data analysis system, the Physical-space Statistical Analysis Syst...
Architecture, implementation and parallelization of the software to search for periodic gravitational wave signals
The parallelization, design and scalability of the PolGrawAllSky code to search for periodic gravitational waves from rotating neutron stars is discussed. The code is based on an efficient implementation of the F-statistic using the Fast Fourier Transform algorithm. To perform an analysis of data from the advanced LIGO and Virgo gravitational wave detectors' network, which will start operating...
Parallel Computing – Different Approaches
One of the key contributors to the explosion of the information and Internet age is the ability to parallelize computation. The theoretical limits of computation with a single machine have long been superseded by massively scalable parallel computers. This paper discusses the different levels at which parallelism can be achieved. Two of the most game-changing technologies are discussed he...
Publication date: 1994